With the rapid growth of the Internet and the overwhelming amount of information that people are confronted
with, recommender systems have been developed to effectively support users' decision-making processes in
online systems. So far, much attention has been paid to designing new recommendation algorithms and
improving existing ones. However, few works have considered the different contributions of different users to
the performance of a recommender system. Such studies can help us improve recommendation
efficiency by excluding irrelevant users. In this paper, we argue that in each online system there exists a
group of core users who carry most of the information for recommendation. With them, the recommender
system can already generate satisfactory recommendations. Our core user extraction method enables
recommender systems to achieve 90% of the accuracy of the top-L recommendation by taking only 20% of
the users into account. A detailed investigation reveals that these core users are not necessarily the
large-degree users. Moreover, they tend to select high-quality objects and their selections are well diversified.

The Internet nowadays provides us with abundant online content, which makes it very time-consuming to go
over every detail and find the information we need. This is often referred to as the information overload
problem. In order to solve it, search engines and recommender systems have been widely investigated. A
search engine returns relevant content based on the keywords given by users. Compared to the search engine,
a recommender system provides more personalized services by predicting potential interests according to
users' historical choices. These techniques have already been successfully applied to some well-known web sites,
such as google.com, amazon.com, taobao.com and youtube.com.

Among recommendation algorithms, the most famous one from computer science is the so-called collaborative
filtering (CF), with user-based and item-based versions. The user-based CF estimates each user's preferences by
referring to the tastes of users similar to her, while the item-based CF recommends items which are similar to the
target user's selected items. Recently, some physical concepts have been introduced to recommendation algorithms.
Since recommender systems can be naturally represented by user-object bipartite networks, some classic
network-based propagation processes, such as mass diffusion and heat conduction, have been applied to find the
most relevant objects for users. The hybridization of these two propagation-based methods can effectively solve
the diversity-accuracy dilemma in recommendation. Many extensions have been built on these algorithms.
For example, preferential diffusion, biased heat conduction and network manipulation are able
to further improve the recommendation accuracy for small-degree objects (i.e. solving the cold-start problem).
More recently, the long-term influence of different recommendation algorithms on network evolution has been
studied.

Related works overwhelmingly focus on designing new algorithms, while, to the best of our knowledge, the effects
of the underlying user-object bipartite networks on the recommendation results have been seriously overlooked.
More specifically, the relevance of individual users to the recommendation process has not yet been well
addressed. In online systems, it is reasonable to imagine that there are some "expert" users who know the quality
of objects in certain fields well. By referring to them, recommender systems can generate satisfactory
recommendations for the users who have common interests with these expert users. Besides, there are some
malicious online users who seek to bias the output of recommender systems. Eliminating these attackers
enhances the robustness of recommender systems. Therefore, investigating users' roles
in recommendation can improve the efficiency as well as the robustness of recommendation algorithms by
excluding irrelevant and unreliable users.

At the individual level, it has already been pointed out that considering the K most similar users to the target
user can improve the recommendation accuracy under the user-based collaborative filtering framework (known as
the "KNN" method). In this paper, we find that such a phenomenon also exists at the
system level, i.e., one can achieve satisfactory recommendations for all
users by referring only to a small group of core users. We first study
the relevance of users in a recommender system and find that there
exists an "information core" consisting of some key users. The size of
the information core is around 20 percent of the whole system. The
recommendation accuracy obtained by relying only on the core users can reach 90
percent of that obtained with all users. This is very meaningful from a practical
point of view, since it can significantly speed up the recommendation
process in real online systems. Moreover, the analysis in this paper is
helpful for online retailers who wish to categorize customers and provide
better personalized services for them.

Figure 1 | A visualization of KNNMD methods. $u_1$ is the target user and two neighbors are selected by the similarity-based method. Results from degree-based and resource-based methods are also shown.

Results 

A recommender system can be naturally represented by a bipartite
network $G(U, O, E)$, where $U = \{u_1, u_2, \ldots, u_n\}$, $O = \{o_1, o_2, \ldots, o_m\}$
and $E = \{e_1, e_2, \ldots, e_l\}$ are the sets of users, objects and links, respectively.
The bipartite network is described by an adjacency matrix $A$, whose
element $a_{i\alpha} = 1$ if user $i$ has collected object $\alpha$, and 0 otherwise (we
use Greek and Latin letters, respectively, for object- and user-related
indices). The degrees of an object $\alpha$ and a user $i$, $k_\alpha$ and $k_i$, are
respectively the number of users who have collected object $\alpha$ and the
number of objects collected by user $i$. For a target user to whom we
will recommend objects, each of her uncollected objects is
assigned a score by the recommendation algorithm, and the top-L
objects with the largest scores are recommended. Different
algorithms generate different object scores and thus different
recommendation lists for users.

The mass diffusion (MD for short) algorithm works by assigning
objects an initial level of "resource" denoted by the vector $\mathbf{f}$ (where
$f_\alpha$ is the resource possessed by object $\alpha$), and then redistributing it via
the transformation $\mathbf{f}' = W\mathbf{f}$, where

$$W_{\alpha\beta} = \frac{1}{k_\beta} \sum_{i=1}^{n} \frac{a_{i\alpha} a_{i\beta}}{k_i}$$

is a column-normalized $m \times m$ matrix. For a target user, the resulting
recommendation list of uncollected objects is sorted according to
$\mathbf{f}'$ in descending order, and the top-L objects with the most resources
are recommended.
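The redistribution above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the dictionary format `user_objects`, the helper names and the toy network are all hypothetical, and the initial resource is assumed to be one unit on each object collected by the target user (as described for the hybrid method in Methods).

```python
from collections import defaultdict


def mass_diffusion_scores(user_objects, target):
    """Two-step mass diffusion scores for the target's uncollected objects."""
    # Invert the bipartite network: object -> set of users who collected it.
    object_users = defaultdict(set)
    for user, objects in user_objects.items():
        for obj in objects:
            object_users[obj].add(user)

    # Assumption: each object collected by the target starts with one unit.
    # Step 1: every such object splits its unit evenly among its users.
    user_resource = defaultdict(float)
    for obj in user_objects[target]:
        for user in object_users[obj]:
            user_resource[user] += 1.0 / len(object_users[obj])

    # Step 2: every user splits what she received evenly among her objects.
    scores = defaultdict(float)
    for user, resource in user_resource.items():
        for obj in user_objects[user]:
            scores[obj] += resource / len(user_objects[user])

    # Only uncollected objects are candidates for recommendation.
    return {o: s for o, s in scores.items() if o not in user_objects[target]}


def top_l(scores, length):
    """Return the `length` highest-scoring objects (ties broken by name)."""
    return sorted(scores, key=lambda o: (-scores[o], o))[:length]


# Hypothetical toy network: three users, four objects.
toy = {"u1": {"a", "b"}, "u2": {"a", "c"}, "u3": {"b", "c", "d"}}
md = mass_diffusion_scores(toy, "u1")
print(top_l(md, 1))  # c scores 5/12 and d scores 1/6, so c is ranked first
```

Working user by user instead of building the full $m \times m$ matrix $W$ gives the same scores for a single target user while only touching the links that actually carry resource.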



The MD method can be described in a more intuitive way: the
initial resources placed on objects are first evenly divided among
neighboring users and then evenly divided among those users'
neighboring objects. In a real network, there can be many neighboring
users who have objects in common with the target user. We argue here
that only a few of the most relevant neighboring users should be
taken into account in the diffusion. Doing so reduces both the
computation required for recommendation and the noise introduced by
the less relevant users. Accordingly, we propose the
K-Nearest Neighbor Mass Diffuse (KNNMD) method, in which only
the K nearest neighbors of the target user are considered. Four
different ways can be used to identify the most relevant neighbors: (1)
random; (2) degree-based; (3) resource-based; (4) similarity-based.
When the resources are located at the user side, the random
method randomly selects K users as the neighbors; the degree-based
method selects the K users with the largest degrees; and the
resource-based method selects the K users with the largest received
resources. The similarity-based method is slightly more
involved. First, we compute
the similarities between the target user and the other users. The cosine
index is used to measure the similarity: $s_{ij} = |\Gamma_i \cap \Gamma_j| / \sqrt{k_i k_j}$,
where $\Gamma_i$ is the set of objects collected by user $i$. The similarity-based
method then selects the K users with the highest similarities to the target user.
A visual representation of KNNMD is given in Fig. 1.
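The similarity-based variant can be sketched as follows. The function names and the toy network are hypothetical. The sketch also assumes that objects first split their resource among all of their users and that only the K selected neighbors then pass resource on; the text does not spell out this normalization, so treat it as one possible reading.

```python
from math import sqrt


def cosine_similarity(objs_i, objs_j):
    """Cosine index: |common objects| / sqrt(k_i * k_j)."""
    if not objs_i or not objs_j:
        return 0.0
    return len(objs_i & objs_j) / sqrt(len(objs_i) * len(objs_j))


def top_k_neighbors(user_objects, target, k):
    """The k users most similar to the target (positive similarity only)."""
    sims = {
        user: cosine_similarity(user_objects[target], objects)
        for user, objects in user_objects.items()
        if user != target
    }
    candidates = [u for u in sims if sims[u] > 0]
    return sorted(candidates, key=lambda u: (-sims[u], u))[:k]


def knn_md_scores(user_objects, target, k):
    """Mass diffusion in which only the k nearest neighbors redistribute."""
    object_degree = {}
    for objects in user_objects.values():
        for obj in objects:
            object_degree[obj] = object_degree.get(obj, 0) + 1

    collected = user_objects[target]
    scores = {}
    for user in top_k_neighbors(user_objects, target, k):
        # Assumption: first-step shares use the full object degree.
        received = sum(1.0 / object_degree[o] for o in collected & user_objects[user])
        for obj in user_objects[user] - collected:
            scores[obj] = scores.get(obj, 0.0) + received / len(user_objects[user])
    return scores


# Hypothetical toy network: four users, five objects.
toy_knn = {
    "u1": {"a", "b"},
    "u2": {"a", "c"},
    "u3": {"b", "c", "d"},
    "u4": {"a", "b", "e"},
}
print(top_k_neighbors(toy_knn, "u1", 2))  # u4 (two common objects) then u2
print(knn_md_scores(toy_knn, "u1", 2))
```

With K = 2 the object d, reachable only through the less similar user u3, receives no resource at all, which is the noise-reduction effect described above.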

We compare the above four KNNMD methods on three real
datasets: Douban, Last.fm and Flickr (see details about the datasets
in Methods). The recall metric (see its definition in
Methods, and the definitions and results for more metrics in SI) is
chosen to measure the accuracy of recommendation algorithms. A
higher recall value corresponds to a higher recommendation
accuracy. The results of these KNNMD methods are presented in
Fig. 2. It can be seen that the best method is the similarity-based
KNNMD, which outperforms the standard MD method for $K \geq 20$
in Douban, $K \geq 20$ in Last.fm and $K \geq 40$ in Flickr. The
optimal neighbor number $K^*$ of this method is around 180 in
Douban, 300 in Last.fm and 280 in Flickr (see Table
S2 in SI). Moreover, one can see that the accuracy of the MD method
is significantly improved by removing the less relevant neighboring
users (see SI for details).

Figure 2 | The accuracy of KNNMD methods. The recommendation length L is set to 20.

Note that the above analysis is at the individual level, and the
K neighbors selected for different individuals are different. The good
performance of KNNMD raises an important question: at the system
level, which users are the most relevant ones for recommending
objects to all users? We refer to this group of users as the
information core of the recommender system.

We thus propose four approaches to assess the relevance of users
and find the information core. The most straightforward one is
simply based on the degrees of users, with the underlying hypothesis
that the relevance of a user is reflected by her degree, so that the
information core consists of the users with the largest degrees. The
second one is to randomly select a set of users as the information core;
this method is used as a benchmark for comparison. In the third
method (frequency-based), we first compute the top-N (e.g. N = 10, 20, 50) most similar
neighbors of each user based on the cosine similarities, and then
count how many times a user appears in other users' top-N
lists. Those users who appear most frequently are selected as the
information core. The fourth one (rank-based) is similar to the third but takes
into account the rank of each user in other users' top-N neighbor
lists. Suppose user $i$ belongs to user $j$'s top-N neighbors and his
position is $p$th; then the score of $i$ is $1/p$. If $i$ also appears in other
users' top-N neighbor lists, we sum his scores as his final weight:
$w_i = \sum_{j:\, i \in N(j)} 1/p_{ij}$, where $N(j)$ is the top-N neighbor set of user $j$, $j$ runs
over all users whose $N(j)$ set contains $i$, and $p_{ij}$ is $i$'s position in $j$'s top-N
neighbor list. Finally, those users with the largest sums are
selected as the information core. A toy example of the frequency-based
and the rank-based methods applied to
the network in Fig. 1 is illustrated in Fig. 3.
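The frequency-based and rank-based selections differ only in the weight a user earns per appearance, so both can be sketched with one function. The input format (`top_n_lists`, an ordered neighbor list per user) and the toy lists are hypothetical and are not the network of Fig. 1.

```python
def information_core(top_n_lists, core_size, rank_based=True):
    """Select core users from every user's ordered top-N neighbor list.

    rank_based=False counts appearances (frequency-based method);
    rank_based=True adds 1/p for an appearance at position p.
    """
    weight = {user: 0.0 for user in top_n_lists}
    for neighbors in top_n_lists.values():
        for position, neighbor in enumerate(neighbors, start=1):
            gain = 1.0 / position if rank_based else 1.0
            weight[neighbor] = weight.get(neighbor, 0.0) + gain
    # Largest weight first; ties are broken by user name for determinism.
    return sorted(weight, key=lambda u: (-weight[u], u))[:core_size]


# Hypothetical top-2 neighbor lists, most similar neighbor first.
toy_lists = {
    "u1": ["u2", "u3"],
    "u2": ["u1", "u3"],
    "u3": ["u1", "u2"],
    "u4": ["u1", "u3"],
    "u5": ["u2", "u3"],
}
print(information_core(toy_lists, 2, rank_based=False))  # ['u3', 'u1']
print(information_core(toy_lists, 2, rank_based=True))   # ['u1', 'u2']
```

In this toy case u3 appears in the most lists but always in second place, so it enters the frequency-based core and drops out of the rank-based one.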

To study the importance of the information core in recommendation,
we make use of four recommendation algorithms: MD,
similarity-based KNNMD (in the following, KNNMD refers to the
similarity-based KNNMD), the hybrid of mass diffusion and heat
conduction (Hybrid for short; details are presented in
Methods) and user-based collaborative filtering (UCF for short;
details are presented in Methods). We first compute the
accuracy of each algorithm in the traditional way, i.e. using all users
in the system. We also compute its accuracy when only the users in
the information core are taken into account. Given the information
core $C$ and the target user $i$, only the users in $C$ receive
resources from $i$'s collected objects in the MD and Hybrid methods.
Other users do not receive resources even though they have objects
in common with $i$. Then the users who have received resources
redistribute them back to the object side. For the KNNMD
method, we first compute $i$'s top-K neighbors within the
information core $C$, and then only these K neighbors receive
resources and redistribute them. Similarly, the top-K neighbors are
restricted to $C$ in the UCF method. This procedure is equivalent to
removing non-core users from the network. However, we still make
recommendations to these non-core users. The importance of the
information core in recommendation can be seen by comparing the
accuracy contributed by the core to that of the traditional methods.
The comparison of the traditional mass diffusion and the
information-core-based mass diffusion is presented in Fig. 4.

Figure 3 | The frequency-based and rank-based methods to find the information core of the user-object bipartite network in Fig. 1. For each user, we select her top-2 neighbors. The size of the information core (number of users) is set to 2, and the information cores are $\{u_2, u_4\}$ and $\{u_2, u_1\}$ by the frequency-based and rank-based methods, respectively.

Figure 4 | The information-core-based mass diffusion. (a) is the traditional mass diffusion method, which considers all the users, and (b) is the information-core-based mass diffusion, which only takes into account information core users. In (b), all the irrelevant links have been removed. The information core users are $u_2$ and $u_4$, which are identified by the frequency-based method.
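A minimal sketch of information-core-based mass diffusion is given below. It takes the statement that the procedure is equivalent to removing non-core users literally, so object degrees are counted over core users only; this normalization, the function name and the toy network are assumptions for illustration.

```python
def core_md_scores(user_objects, target, core):
    """Mass diffusion for `target` in which only core users carry resource."""
    collected = user_objects[target]

    # Assumption: with non-core users removed, an object's degree is the
    # number of core users who collected it.
    core_degree = {}
    for user in core:
        for obj in user_objects[user]:
            core_degree[obj] = core_degree.get(obj, 0) + 1

    scores = {}
    for user in core:
        if user == target:
            continue  # the target only returns resource to collected objects
        received = sum(1.0 / core_degree[o] for o in user_objects[user] & collected)
        if received == 0:
            continue  # no common object with the target
        for obj in user_objects[user] - collected:
            scores[obj] = scores.get(obj, 0.0) + received / len(user_objects[user])
    return scores


# Hypothetical toy network; u1 is a non-core target user.
toy_core = {
    "u1": {"a", "b"},
    "u2": {"a", "c"},
    "u3": {"b", "c", "d"},
    "u4": {"a", "b", "e"},
}
print(core_md_scores(toy_core, "u1", {"u2", "u4"}))  # c: 0.25, e: 0.5
```

The non-core user u3 is skipped entirely, so object d is never scored, yet the non-core target u1 still obtains a recommendation list.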

We again use the recall metric to measure the accuracy of the
algorithms (the results for the precision metric are quite similar, see
SI). The results are presented in Fig. 5, where $r$ denotes the fraction of
users in the information core. When $r = 1$, all the users are used in
the recommendation algorithms, which is equivalent to the traditional
method. Generally speaking, the recommendation accuracy tends
to decrease as $r$ decreases, since less information is available to the
recommendation algorithm. The accuracy decreases more slowly when
we choose the rank-based method to identify the information core.
Taking the KNNMD method on the Douban data as an example,
91.4% (0.1886/0.2063) of the accuracy can be achieved when we only
use 20% of the users ($r = 0.2$). Notably, for the MD method on the
Douban data, the accuracy with only 20% of the users ($r = 0.2$) can be even
slightly better than that with all users ($r = 1$). Similar results are also
observed in the other two datasets. This is of great importance since
the algorithmic efficiency of recommendation methods can be largely
improved if we consider fewer users. In Fig. 5, in some cases the
random method performs even better than the degree-based method.
This is because users in the degree-based information core tend to
choose small-degree items. Therefore, recommendations based on
these large-degree users mainly include niche (small-degree)
items, and the recommendation accuracy is low accordingly. The
random method selects core users randomly. Although these
core users are not selected by any calculation, they are well
separated. Their selections are diversified, so the recommendation
results are better than those of the degree-based method. However, the
random method cannot outperform the frequency-based and
rank-based methods. Moreover, it must be noted that some non-core
users might be isolated, since they have not selected any objects
in common with the core users. We define the non-isolated users as
those who have collected at least one object in common with the core
users. We find that the ratio of non-isolated users is more than 99.9%
even when $r = 0.1$, indicating that only a very small fraction of users have
no common items with the core users. In this paper, we randomly
recommend L objects to these isolated users, since they have a
negligible effect on our conclusions. From a practical point of view, one can
recommend the most popular objects to them.

Figure 5 | The recommendation accuracy contributed only by the information core in the recommender systems. The recommendation length L is set to 20. For the frequency-based and rank-based methods, we select each user's top-20 neighbors. $r$ is the ratio of the size of the information core to the whole system. The error bars are obtained from 5 independent instances of the training and probe sets.

From the above results, it can be seen that the rank-based
method is better than the frequency-based method in identifying
the information core, which indicates that the rank of a user
in others' top-N neighbor lists matters when assessing
her relevance in the recommender system. If a user appears in
most users' top-N neighbor lists with a high rank, she should be
considered a key member of the online system, since many
users' recommendations rely on her. Both methods are generally
better than the random and degree-based methods. Among
these methods, the degree-based method is the worst, which
indicates that the large-degree users are not necessarily the "experts".
Taking the MD method on the Douban data for instance, the
accuracy of the degree-based method is much lower than that of
the rank-based method when $r = 0.2$. In many previous works
on real networks with heterogeneous degree distributions,
attention has been overwhelmingly paid to the hubs (nodes with the largest
degrees). Our finding here suggests that degree may not be the
proper criterion to judge the importance of nodes in the information
filtering process, which is in a sense analogous to the weak-tie
effect in information filtering.
Table 1 | The average degree of the information core users, $\langle k_u \rangle$, and the average degree of the objects selected by them, $\langle k_o \rangle$ (r = 0.2)

                   Douban              Flickr              Last.fm

Methods            <k_u>    <k_o>      <k_u>    <k_o>      <k_u>    <k_o>

Degree-based       416.3    261.2      153.9    449.1      49.7     388.1

Random-based       93.5     382.9      52.4     491.1      39.7     408.5

Frequency-based    85.8     609.4      48.0     682.5      39.8     614.3

Rank-based         83.4     582.5      43.4     654.0      39.6     575.7



Apart from the recall, we also consider a global accuracy metric
called the ranking score (see the results in SI). For a target user, all her
uncollected items are ranked by the recommendation algorithm,
and the average position of the objects in the probe set is
defined as the ranking score, which can be used to measure the
accuracy of algorithms. The smaller the ranking score, the better
the recommendation. In contrast to the recall results, the best method
under this metric is the degree-based information core method rather than
the rank-based one. This is because more objects receive resource if we
choose large-degree users as the core users. Moreover, the ranking
scores of all information-core-based methods are much worse than
those of the corresponding standard methods, which consider all users. In fact,
any attempt to reduce the number of users in recommendation
tends to increase the ranking score. However, measuring the
accuracy of the top-L objects in individuals' recommendation lists is actually
more important from a practical point of view, since in real
recommender systems individuals are only presented with the top-L objects.
We also compare the diversity of the information-core-based
methods with that of their traditional counterparts. Our results indicate
that the rank-based information core generally increases the
recommendation diversity (see the results in SI).

We then investigate the structural properties of information cores.
For simplicity, the relative size of the information core ($r$) is set to 0.2.
After obtaining the core users from the different methods, we find that
the core users detected by the rank-based and frequency-based methods
overlap heavily, whereas for any other pair of methods only a
small fraction of core users overlap. We then compute the average
degree of the information core users, $\langle k_u \rangle$, and the average degree of the
objects selected by these core users, $\langle k_o \rangle$. The results are presented in
Table 1. It can be seen that $\langle k_u \rangle$ is largest in the degree-based information
core and smallest in the rank-based information core,
which indicates that our core users are not necessarily the
large-degree users. Moreover, $\langle k_o \rangle$ of the rank-based information
core is large, indicating that our core users indeed tend to select
high-quality objects. In contrast, $\langle k_o \rangle$ of the degree-based
information core is very small, as shown in Table 1. The detailed
distributions for the core users can be seen in SI.

Secondly, we investigate the intra-similarity ($\langle s_{in} \rangle$) and
inter-similarity ($\langle s_{ex} \rangle$) of the core. The intra-similarity is defined as the average
cosine similarity between core user pairs. The inter-similarity is
defined as the average cosine similarity over all user pairs
that consist of one core user and one non-core user. The results
are presented in Table 2. For the random-based method, it is natural
that the intra-similarity and inter-similarity are both low, because
the core users are randomly selected. However, one can
observe that both the intra-similarity and inter-similarity of the
rank-based method are smaller than those of the degree-based
method, which indicates that our core users are well diversified. The
result is different in the Last.fm data because
the user degree in this network is more homogeneous.
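The two averages can be computed directly from their definitions. The sketch below uses hypothetical helper names and a toy network, and it enumerates all pairs, which is only practical for small examples.

```python
from itertools import combinations
from math import sqrt


def cosine(objs_a, objs_b):
    """Cosine index between two users' object sets."""
    if not objs_a or not objs_b:
        return 0.0
    return len(objs_a & objs_b) / sqrt(len(objs_a) * len(objs_b))


def intra_inter_similarity(user_objects, core):
    """Return (intra-similarity, inter-similarity) of a set of core users."""
    core_users = sorted(u for u in user_objects if u in core)
    other_users = sorted(u for u in user_objects if u not in core)
    # Intra: all unordered pairs of core users.
    intra = [cosine(user_objects[a], user_objects[b])
             for a, b in combinations(core_users, 2)]
    # Inter: all pairs made of one core user and one non-core user.
    inter = [cosine(user_objects[a], user_objects[b])
             for a in core_users for b in other_users]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(intra), mean(inter)


# Hypothetical toy network with core {u2, u4}.
toy_sim = {
    "u1": {"a", "b"},
    "u2": {"a", "c"},
    "u3": {"b", "c", "d"},
    "u4": {"a", "b", "e"},
}
s_in, s_ex = intra_inter_similarity(toy_sim, {"u2", "u4"})
print(round(s_in, 4), round(s_ex, 4))
```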

According to the above analysis, it is clear that the core users from
these methods have different properties. In order to further
understand their roles in the network and in recommendation, we consider three
indices: degree heterogeneity, clustering coefficient and diffusion
coverage (see the results in SI). For each real network, we first construct
a corresponding sub-network which consists only of core users. We
study the item degree heterogeneity and clustering coefficient in the
sub-networks. The results indicate that the frequency-based and
rank-based core users tend to connect to some common items, while
the degree-based core users' links are more evenly distributed among
items. In fact, the clustering coefficient has been shown to be closely
related to the efficiency of the recommendation process. The higher
clustering coefficient of the rank-based core users explains why this
method leads to a better recommendation accuracy than the others in
Fig. 5. Many recommendation methods are based on a three-step
diffusion (even the well-known collaborative filtering can be
regarded as a diffusion process). The diffusion normally starts from
the target user. In the first step, it finds the objects selected by the
target user. In the second step, the users who selected the same
objects as the target user are found; they are referred to as relevant
users. In the last step, the items selected by the relevant users are
found; they are called relevant items. The relevant items with the
highest diffusion resource are recommended to the target user.
Obviously, the smaller the number of relevant items, the stronger
the filtering effect of the diffusion. We find that if degree-based
core users are used, the diffusion coverage is the same as in the case
where all users are used, indicating a poor filtering effect. If the
frequency-based or rank-based core users are used, the diffusion
coverage is significantly narrowed, such that only the most relevant
items can be reached by the diffusion.
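The three-step diffusion can be traced explicitly to count the relevant items. The exact definition of diffusion coverage is given in the SI and is not reproduced here, so the sketch below assumes it is the number of uncollected items reachable from the target user; the function name, the optional `allowed_users` argument and the toy network are hypothetical.

```python
def diffusion_coverage(user_objects, target, allowed_users=None):
    """Number of relevant (reachable, uncollected) items for the target.

    allowed_users: optional set restricting the relevant users, e.g. an
    information core; None means that all users may take part.
    """
    # Step 1: objects selected by the target user.
    collected = user_objects[target]
    # Step 2: relevant users share at least one object with the target.
    relevant_users = [
        user for user, objects in user_objects.items()
        if user != target
        and objects & collected
        and (allowed_users is None or user in allowed_users)
    ]
    # Step 3: relevant items are the objects of the relevant users.
    relevant_items = set()
    for user in relevant_users:
        relevant_items |= user_objects[user]
    return len(relevant_items - collected)


# Hypothetical toy network.
toy_cov = {
    "u1": {"a", "b"},
    "u2": {"a", "c"},
    "u3": {"b", "c", "d"},
    "u4": {"a", "b", "e"},
}
print(diffusion_coverage(toy_cov, "u1"))                # 3 items: c, d, e
print(diffusion_coverage(toy_cov, "u1", {"u2", "u4"}))  # 2 items: c, e
```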

To obtain the information core, one needs to compute similarities
over all user pairs. Therefore, the complexity of obtaining the core
users is $O(n^2 m)$. Once we have the information core, the computation
time of the recommendation algorithms can be reduced by a factor of up to 5
(for $r = 0.2$). Moreover, although it takes time to obtain the information
core, the core is relatively stable in real systems. Taking the
Douban dataset for instance, more than 90% of the information core users
(frequency-based and rank-based) stay the same in two adjacent
months. Therefore, it is enough to update the information core once
a month, which significantly reduces the computational cost and
makes our method meaningful in practice.

Table 2 | The intra-similarity and inter-similarity of the core

                   Douban              Flickr              Last.fm

Methods            <s_in>   <s_ex>     <s_in>   <s_ex>     <s_in>   <s_ex>

Degree-based       0.0421   0.0194     0.0353   0.0144     0.0171   0.0164

Random-based       0.0145   0.0146     0.0104   0.0106     0.0161   0.0163

Frequency-based    0.0281   0.0176     0.0176   0.0123     0.0320   0.0202

Rank-based         0.0242   0.0169     0.0145   0.0116     0.0292   0.0197

Table 3 | The statistics of the Douban, Last.fm and Flickr datasets. The sparsity is defined as $l / (n \times m)$.

Dataset    #users, n    #objects, m    #links, l    sparsity

Douban     17,000       223,823        2,109,749    5.54 × 10^-4

Last.fm    30,000       87,082         1,467,235    5.62 × 10^-4

Flickr     30,000       61,352         1,924,461    7.37 × 10^-4

Discussion 

During the past decade, recommender systems have been widely
investigated in several research fields, including computer science,
physics and sociology. A large number of recommendation
algorithms have been proposed. However, little attention has been paid to
studying the effect of the underlying user-object bipartite network on
the recommendation process. In this paper, we study the relevance of
individual users and find that there exists an information core whose
size is small compared to the whole network. The users in the
information core usually appear in many users' top-N neighbor lists
with high ranks. For many recommendation algorithms, one can
achieve very good recommendation accuracy by using only the core
users. A similar idea can be extended to item-based
collaborative filtering: one can use only the links of the core users to
calculate the items' similarity matrix and obtain accurate
recommendations. This work may find wide application in practice. For
one thing, it can significantly speed up the recommendation process
in real online systems, since the recommendation engine only has to
deal with a small fraction of the data. For another, the analysis in this
paper can also help online retailers to categorize customers
and provide better personalized services for them.

There are still many open issues, such as extending similar
techniques to monopartite networks for link prediction. Another
interesting open issue is the location of these core users in the
network. Specifically, one can investigate whether the core users are
spread across different communities. Such a study may
lead to better topology-based methods for identifying the core users
in networks. Finally, the evolution of the information core is also an
important topic. An information core that is relatively stable over time
needs to be updated less frequently, which further reduces the
computational cost in practice.

Methods 

Data description. We use three datasets to test the accuracy of algorithms, namely
Douban, Last.fm and Flickr. Douban (www.douban.com), launched on March 6,
2005, is a Chinese Web 2.0 web site providing user rating, review and
recommendation services for movies, books and music. It is also the largest online
Chinese book, movie and music database and one of the largest online communities
in China. The raw data contains user activities before Aug 2010 and we randomly
sample 17,000 users who have collected at least ten songs. Last.fm (www.last.fm)
is a popular worldwide social music site; the objects in this dataset are
artists, which can be collected via the Last.fm API. The raw data consists of 360,000
users and we randomly sample 30,000 users who have collected at least five items
(artists). Flickr (www.flickr.com) is a photo-sharing site based on a social network.
The data used in this paper is individuals' group membership in Flickr, i.e.
their participation in groups. Accordingly, we provide group recommendations
for users instead of object recommendations. We randomly sample 30,000 users who have joined at
least ten groups. We treat the user-object (user-group) interaction matrix as binary,
that is, the element equals 1 if the user has viewed or rated the object (joined the
group) and 0 otherwise (see Table 3). In this paper, we filter out those users whose
degrees are smaller than 10 (5 for Last.fm), as it is very difficult to recommend
items to such small-degree users.

Evaluating recommender systems. Each dataset is randomly divided into two parts: the
training set and the probe set $E^P$. The training set contains 80% of the original links
and the recommendation algorithm runs on it. The rest of the links form the probe
set, which is used to assess the performance of the recommendation algorithm.
The result is obtained by averaging over five runs with independent random
divisions into training set and probe set.

Each user $i$ may have a certain number of links (corresponding to objects) in
the probe set, which we denote by $E_i$. After the recommendation list (with length $L$) is
generated for user $i$, we calculate $d_i(L)$ as the number of her probe-set objects
that appear in the recommendation list. The recall of this user is defined as
$R_i(L) = d_i(L)/E_i$ and the recall of the whole system is defined as
$R(L) = \frac{1}{n} \sum_{i=1}^{n} R_i(L)$. A
higher recall value indicates a higher accuracy of the recommendation algorithm.
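The recall computation can be sketched as below. The function name and input formats are hypothetical, and the sketch averages only over users that have at least one probe link, since $R_i(L)$ is undefined when $E_i = 0$; that handling is an assumption rather than something stated in the text.

```python
def recall_at_l(recommendations, probe, length):
    """System-level recall R(L) averaged over users with probe links.

    recommendations: dict user -> ranked list of recommended objects.
    probe: dict user -> set of objects held out for that user.
    """
    per_user = []
    for user, held_out in probe.items():
        if not held_out:
            continue  # assumption: skip users without probe links
        top = set(recommendations.get(user, [])[:length])
        hits = len(top & held_out)              # d_i(L)
        per_user.append(hits / len(held_out))   # R_i(L) = d_i(L) / E_i
    return sum(per_user) / len(per_user) if per_user else 0.0


# Hypothetical ranked lists and probe sets for two users.
toy_recs = {"u1": ["c", "e", "d"], "u2": ["b", "d"]}
toy_probe = {"u1": {"c", "d"}, "u2": {"e"}}
print(recall_at_l(toy_recs, toy_probe, 2))  # 0.25: u1 hits 1 of 2, u2 hits 0 of 1
print(recall_at_l(toy_recs, toy_probe, 3))  # 0.5: u1 hits 2 of 2, u2 hits 0 of 1
```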

Hybrid algorithm. When recommending objects for user $i$, the hybrid method works
by assigning each object collected by user $i$ one unit of resource. The initial resources
are denoted by the vector $\mathbf{f}$, where $f_\alpha$ is the resource possessed by object $\alpha$. They
are then redistributed via the transformation $\mathbf{f}' = W\mathbf{f}$, where

$$W_{\alpha\beta} = \frac{1}{k_\alpha^{1-\lambda} k_\beta^{\lambda}} \sum_{j=1}^{n} \frac{a_{j\alpha} a_{j\beta}}{k_j}$$

is the redistribution matrix, with $k_\gamma = \sum_{j=1}^{n} a_{j\gamma}$ and
$k_j = \sum_{\gamma=1}^{m} a_{j\gamma}$ denoting the degree of object $\gamma$ and user $j$, respectively. $\lambda$ is a tunable
parameter which adjusts the relative weight between the mass diffusion algorithm
($\lambda = 1$) and the heat conduction algorithm ($\lambda = 0$).
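The hybrid redistribution can be evaluated for one target user without forming $W$ explicitly. The following is a sketch under the same conventions as the formula above; the function name, the argument `lam` (standing for $\lambda$) and the toy network are hypothetical.

```python
def hybrid_scores(user_objects, target, lam):
    """Hybrid mass-diffusion / heat-conduction scores for uncollected objects.

    lam = 1 gives mass diffusion and lam = 0 gives heat conduction.
    """
    object_degree = {}
    for objects in user_objects.values():
        for obj in objects:
            object_degree[obj] = object_degree.get(obj, 0) + 1

    collected = user_objects[target]
    scores = {}
    for user, objects in user_objects.items():
        # Resource reaching this user: sum over common objects of 1 / k_beta^lam.
        received = sum(object_degree[b] ** -lam for b in objects & collected)
        if received == 0:
            continue
        for obj in objects - collected:
            # Passed on with weight 1 / (k_j * k_alpha^(1 - lam)).
            share = received / (len(objects) * object_degree[obj] ** (1 - lam))
            scores[obj] = scores.get(obj, 0.0) + share
    return scores


# Hypothetical toy network: d is a niche object with degree 1.
toy_hybrid = {"u1": {"a", "b"}, "u2": {"a", "c"}, "u3": {"b", "c", "d"}}
print(hybrid_scores(toy_hybrid, "u1", 1))  # mass diffusion: c 5/12, d 1/6
print(hybrid_scores(toy_hybrid, "u1", 0))  # heat conduction: c 5/12, d 1/3
```

In this toy case lowering `lam` doubles the score of the degree-1 object d while leaving c unchanged, which illustrates how heat conduction favors small-degree objects.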

User-based collaborative filtering. In the user-based collaborative filtering method,
the basic assumption is that similar users usually collect the same objects.
Accordingly, the recommendation score of object $\alpha$ for the target user $i$ is
$p_{i\alpha} = \sum_{j \in N(i)} s_{ij} a_{j\alpha}$, where $N(i)$ is the set of top-K neighbors of user $i$ and $s_{ij}$ is the
similarity between users $i$ and $j$. The cosine index is chosen to measure their similarity:
$s_{ij} = \sum_{\alpha} a_{i\alpha} a_{j\alpha} / \sqrt{k_i k_j}$,
where $k_i$ is the degree of user $i$.
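A minimal sketch of this scoring rule follows. The function name and toy network are hypothetical, and users with zero similarity to the target are never treated as neighbors, which is an assumption made here for simplicity.

```python
from math import sqrt


def ucf_scores(user_objects, target, k):
    """User-based CF: sum the similarities of the top-k neighbors per object."""
    collected = user_objects[target]
    # Cosine similarity between the target and every user sharing an object.
    sims = {}
    for user, objects in user_objects.items():
        common = objects & collected
        if user != target and common:
            sims[user] = len(common) / sqrt(len(objects) * len(collected))

    neighbors = sorted(sims, key=lambda u: (-sims[u], u))[:k]

    scores = {}
    for user in neighbors:
        for obj in user_objects[user] - collected:
            scores[obj] = scores.get(obj, 0.0) + sims[user]
    return scores


# Hypothetical toy network.
toy_ucf = {
    "u1": {"a", "b"},
    "u2": {"a", "c"},
    "u3": {"b", "c", "d"},
    "u4": {"a", "b", "e"},
}
print(ucf_scores(toy_ucf, "u1", 2))  # neighbors u4 and u2 contribute e and c
```

Restricting the candidates in `sims` to an information core before taking the top k would give the core-based UCF variant described in Results.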